Capacity Requirement of a Mail Sorting Device 1

B. K. Bender and A. J. Goldman 

A mathematical model of a sorting device suggested by S. Henig is considered. The
relevant parameters are r (the number of destinations for the mail) and k (the number of
letters entering the device during each cycle of operation). It is shown that the device
will never jam if its capacity is at least $(r-1)(k-1)+k$ letters; this is the lowest possible
capacity requirement, and for realistic values of r and k is significantly less than the
previously known estimate $r^2 k$.



1. Introduction 

We deal with a highly idealized mathematical 
model of a mail sorting device suggested by S. Henig 
(NBS Electronic Instrumentation Section). For our
purposes, the operation of the device can be described 2 
as follows. Mail (to any of r destinations) enters 
the system. After k letters have entered, the device 
"asks itself" to which destination it contains the 
most letters. 3 All letters to this predominant destina- 
tion are then dropped out of the device, another k 
letters enter, and the process continues. 

The device is said to jam if, after a dropout, it is 
still so "full" that entrance of the next k letters would 
cause an "overflow". We will determine the mini- 
mum capacity required of the device to ensure that 
no jamming occurs. This is of course equivalent to 
determining (in terms of r and k) the maximum pos- 
sible contents of the device under the dropout rule 
described above, and we shall work with this alter- 
native version of the problem. The derivation in- 
volves nothing more complicated than counting up 
letters, so that nonmathematician readers should be 
able to follow the arguments. 
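
The cycle of operation described above can be sketched as a short simulation. This is only an illustration and not part of the original analysis: the function name `run_device`, the numbering of destinations as integers 0 to r-1, and the tie-breaking rule (the lowest-numbered destination wins) are all assumptions made here.

```python
from collections import Counter


def run_device(batches, r):
    """Simulate the idealized device.

    `batches` is a sequence of lists, each holding the destinations
    (integers 0..r-1) of the k letters entering in one cycle.
    Returns the contents of the device just before each dropout.
    """
    contents = Counter()
    history = []
    for batch in batches:
        contents.update(batch)                   # k letters enter
        history.append(sum(contents.values()))   # contents just before the dropout
        # Predominant destination; ties go to the lowest index (assumed rule).
        dest = max(range(r), key=lambda d: (contents[d], -d))
        contents.pop(dest, None)                 # all letters to it drop out
    return history


print(run_device([[0, 1, 1], [0, 1, 1]], r=2))
```

With r=2 and k=3 as in this call, the device holds 3 letters before the first dropout and 4 before the second.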

2. Statement of Results 

In order to describe our results, it is convenient 
to define 

x(t) = number of letters in the device just before the
t-th dropout. (1)



A. Bruce Clarke 4 has shown that

    $x(t) \le r^2 k$, for all t,    (2)



which proves that the number of letters in the device
never exceeds $r^2 k$, so that the device will never jam
if its capacity is $r^2 k$ or more. Clarke (see footnote 4)
remarks that "this upper bound for x(t) is clearly much
too crude to be of any practical use;" for example, for
the Philadelphia-mail-type data (with r=200, k=3)
used for numerical illustration, it only informs us
that the device will never jam if its capacity is
120,000 or more.

1 Part of a project supported by the Post Office Department, Office of Research
and Engineering.

2 The physical device can also operate under "dropout rules" other than the
one described below.

3 A rule for breaking "ties" between destinations is also required.

4 A. Bruce Clarke, A mathematical model for the Henig sorting machine, partly
abstracted in Ann. Math. Statistics 29, 622 (1958).

As was pointed out by L. S. Joel (NBS Computa- 
tion Laboratory), the estimate (2) can be very much 
reduced using rather simple arguments. It will in 
fact be shown that 



    $x(t) \le (r-1)(k-1)+k$, for all t,    (3)



which proves that the number of letters in the device
can never exceed $(r-1)(k-1)+k$, so that a capacity
of at least $(r-1)(k-1)+k$ will ensure that the device
never jams. For the Philadelphia-type data, this
shows that a capacity of 401 (rather than 120,000) is
sufficient to prevent jamming. The proof of inequality
(3) will be given in section 3.
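
The two capacity figures quoted for the Philadelphia-type data can be checked by direct arithmetic. The short sketch below does so; the function names `clarke_bound` and `sharp_bound` are labels introduced here for illustration only.

```python
def clarke_bound(r, k):
    """Earlier estimate (2): r^2 * k."""
    return r * r * k


def sharp_bound(r, k):
    """Estimate (3): (r-1)(k-1) + k."""
    return (r - 1) * (k - 1) + k


r, k = 200, 3
print(clarke_bound(r, k))  # 120000
print(sharp_bound(r, k))   # 401
```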

In addition, it will be shown that the device can
jam if its capacity is less than $(r-1)(k-1)+k$; that
is, it is possible (under the dropout rule described
above) for the number of letters in the device actually
to reach $(r-1)(k-1)+k$. In other words, the estimate
in (3) is not only an upper bound for x(t) but is
the maximum possible value for x(t):

    $x(t)_{\text{max poss}} = (r-1)(k-1)+k$.    (4)



This will also be proved in section 3. Section 4 con- 
tains some comments on the probabilistic aspects of 
the problem. 

3. Proofs 

3.1. Proof of Inequality (3) 

It is assumed, of course, that the device had fewer
than $(r-1)(k-1)+k$ letters in it at the start of operation.
An "indirect proof" will be used; that is, we
tentatively suppose that (3) is false and show
that this supposition is untenable.

If (3) is false, there must be a first value $t_0$ of t (i.e.,
a first dropout) for which it fails. Since (3) is false
for $t=t_0$ but true for $t=t_0-1$, we have

    $x(t_0) > (r-1)(k-1)+k$,    (5)
    $x(t_0-1) \le (r-1)(k-1)+k$.    (6)

We continue the argument by pointing out two
facts about the contents of the device directly after the
$(t_0-1)$-st dropout:

(a) There are more than $(r-1)(k-1)$ letters in the
device, and

(b) There are at least k letters in the device to
some one destination.

The truth of (a) follows directly from (5) and the
fact that only k letters were added between the
$(t_0-1)$-st and $t_0$-th dropouts. All letters to some
destination left the device in the $(t_0-1)$-st dropout,
so that if (b) were false the device would contain at
most $k-1$ letters to each of at most $r-1$ destinations,
or at most $(r-1)(k-1)$ letters in all, contradicting (a).
Thus (b) is true.

From (b) and the dropout rule it can be seen that
the $(t_0-1)$-st dropout consisted of at least k letters,
so, examining the contents of the device just before
the $(t_0-1)$-st dropout, we have

    (number of letters to predominant dest.) $\ge k$,

and by (a),

    (number of letters to other dest.'s) $> (r-1)(k-1)$;

the last two inequalities together yield

    $x(t_0-1) > (r-1)(k-1)+k$,    (7)



which contradicts (6). Thus the supposition that (3) 
is false is untenable, so (3) must be true. 
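
For very small r and k, inequality (3) can also be checked by brute force. The sketch below is an independent sanity check rather than the method of this paper: starting from an empty device, it enumerates every possible batch of k letters and, when several destinations are tied, follows every choice, so that no particular tie-breaking rule is assumed. The name `max_contents` is hypothetical, and the search is practical only for small parameter values.

```python
from itertools import combinations_with_replacement


def max_contents(r, k):
    """Largest contents just before a dropout, over all reachable situations."""
    start = (0,) * r
    seen = {start}          # states of the device directly after a dropout
    frontier = [start]
    best = 0
    while frontier:
        state = frontier.pop()
        # Every multiset of k destinations is a possible incoming batch.
        for batch in combinations_with_replacement(range(r), k):
            counts = list(state)
            for d in batch:
                counts[d] += 1
            best = max(best, sum(counts))
            top = max(counts)
            # Follow every destination tied for predominance.
            for d in range(r):
                if counts[d] == top:
                    nxt = tuple(counts[:d] + [0] + counts[d + 1:])
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
    return best


for r, k in [(2, 3), (3, 2), (3, 3)]:
    print(r, k, max_contents(r, k), (r - 1) * (k - 1) + k)
```

The search terminates because, by (3), only finitely many states are reachable.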

3.2. Proof of Equation (4) 

We already know, by (3), that the device can never
contain more than $(r-1)(k-1)+k$ letters. Thus, in
order to prove (4), it suffices to exhibit a sequence of
possible events leading to a situation in which the
device contains exactly $(r-1)(k-1)+k$ letters. There are
many such sequences, and the one described below is
not necessarily the shortest one.

The sequence will be constructed in two stages,
beginning with the device empty at t=0. At the
end of stage 1, the device will contain $k-1$ letters to
each of 2 destinations and $k-2$ letters to each of the
other $r-2$ destinations. At the end of stage 2, the
device will contain exactly $(r-1)(k-1)+k$ letters
and the proof of equation (4) will have been
completed.

Stage 1: We suppose that the r destinations are
numbered in some way. The sequence of possible
events begins as follows: 5 The first k letters entering
the device consist of one letter to the first destination
and $k-1$ letters to the rth destination; these
last $k-1$ letters then leave the device in the first
dropout. The second k letters to enter the device
consist of one letter to the second destination and
$k-1$ letters to the rth destination, etc. After the
$(r-1)$-st dropout, the device contains no letters to
the rth destination and one letter to each other
destination. The rth set of k letters to enter the
device consists of one letter to the first destination
and $k-1$ letters to the rth destination, and the
process continues as before; after the $(r-1)(k-2)$-nd
dropout the device contains no letters to the rth
destination and $k-2$ letters to each of the $r-1$
other destinations.

5 For a more formal description of this process, let R(t) denote the
remainder when t is divided by $r-1$, except that $R(t)=r-1$ when t is a multiple
of $r-1$. Then for $1 \le t \le (r-1)(k-2)$, the k letters entering the device between
the $(t-1)$-st and t-th dropouts are to consist of one letter to the R(t)-th destination
and $k-1$ letters to the rth destination.

The k letters entering the device between the
$(r-1)(k-2)$-nd and $((r-1)(k-2)+1)$-st dropouts
consist of $k-1$ letters to the rth destination and one
letter to some other destination; the device now
contains $k-1$ letters to each of 2 destinations and
$k-2$ letters to each of the other $r-2$ destinations.
In other words, if we define

    n(t) = number of destinations to which there are
    $k-1$ letters in the device just before the
    t-th dropout,    (8)

then we have

    $n((r-1)(k-2)+1) = 2$.    (9)



The $((r-1)(k-2)+1)$-st dropout will consist of
$k-1$ letters to whichever of the 2 destinations
mentioned above is "preferred" by whatever tie-breaking
rule is employed. After this dropout the
device will contain no letters to some one destination,
$k-1$ letters to some other destination, and $k-2$
letters to each of the $r-2$ remaining destinations.

Stage 2: In this stage matters will be arranged so
that

    $n(t) = n(t-1)+1$    (10)

    for $(r-1)(k-2)+2 \le t \le (r-1)(k-1)$;

i.e., so that after each dropout there are $k-1$ letters
in the device to one more destination than before.
If this is done, then by (9) and repeated application
of (10) we have

    $n((r-1)(k-1)) = n((r-1)(k-2)+(r-1)) = r$,

so that just before the $(r-1)(k-1)$-st dropout the
device contains $k-1$ letters to each of the r destinations.
Then $k-1$ of these $r(k-1)$ letters are dropped
leaving $(r-1)(k-1)$ in all, and k new letters enter;
the device now contains $(r-1)(k-1)+k$ letters, and
equation (4) is proved.

To achieve (10) we choose the k letters entering
the device between the $(t-1)$-st and t-th dropouts,
for $t=(r-1)(k-2)+2$, $(r-1)(k-2)+3$, . . .,
$(r-1)(k-1)$, to consist of $k-1$ letters to the destination
just previously dropped out, and one letter
to a destination which previously had $k-2$ letters
to it in the device.
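
The two-stage sequence can be replayed mechanically as a check on the counting. The sketch below follows stages 1 and 2 as described, under assumptions introduced here: destinations are numbered 0 to r-1 (so index r-1 plays the role of the rth destination), the name `worst_case_history` is hypothetical, and the tie-breaking rule is passed in as a function (`min` or `max` over the tied indices) to show that the construction does not depend on it. It requires r and k to be at least 2.

```python
def worst_case_history(r, k, tie_break=min):
    """Contents just before each dropout along the two-stage sequence."""
    counts = [0] * r
    history = []

    def cycle(batch):
        for d in batch:
            counts[d] += 1                      # k letters enter
        history.append(sum(counts))             # contents before the dropout
        top = max(counts)
        dropped = tie_break(d for d in range(r) if counts[d] == top)
        counts[dropped] = 0                     # predominant destination drops out
        return dropped

    last = r - 1
    # Stage 1: one letter to a rotating destination, k-1 letters to the last one.
    for t in range((r - 1) * (k - 2)):
        cycle([t % (r - 1)] + [last] * (k - 1))
    # End of stage 1: two destinations now hold k-1 letters each.
    dropped = cycle([0] + [last] * (k - 1))
    # Stage 2: refill the destination just dropped, top up one holding k-2.
    for _ in range(r - 2):
        fill = next(d for d in range(r) if d != dropped and counts[d] == k - 2)
        dropped = cycle([fill] + [dropped] * (k - 1))
    # After the (r-1)(k-1)-st dropout, any k further letters reach the maximum.
    cycle([0] * k)
    return history


print(worst_case_history(3, 3))            # [3, 4, 5, 6, 7]
print(worst_case_history(200, 3)[-1])      # 401
```

For r=3, k=3 the last entry is 7, which equals (r-1)(k-1)+k, and the same final value is obtained with `tie_break=max`.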

4. Probabilistic Aspects of the Capacity Problem

We have been concerned above with choosing the 
capacity of the device so that jamming never occurs. 
In practice we might well have the more modest 
aim of keeping the frequency of jamming down to 
some tolerable level, and so could get by with a 
smaller capacity than that indicated by eq (4). On
the one hand this leads to the "policy" question of 
what jamming frequencies are to be considered 
intolerable; on the other hand, it leads to the purely 
mathematical problem of what capacity is needed in 
order to keep jamming below these frequencies. 

This last problem seems very difficult. As was 
pointed out (unpublished memorandum, July 1956) 
by J. R. Rosenblatt (NBS Statistical Engineering 
Laboratory), the behavior of the device constitutes 
a stochastic process (indeed, a Markov chain) 
governed by the probabilistic distribution of mail by 
destinations. The possible number of "states" of 
the device is very large (for realistic values of r and 
k), and the creation of analytic techniques for the 
treatment of Markov chains with a great many 
states appears to be one of the "underdeveloped 
areas" of applied mathematics. We know that the 
specific Markov chain in the current problem has a 
probabilistic "steady state" toward which the device 
"settles down," and the average number E(x) of 
letters in the device before dropouts, in this steady 



state, might serve as a very rough initial guide 6 to 
the capacity required of the device. Even the de- 
termination of E(x) seems far from easy; we may 
note that the upper bound derived for E(x) by a 
complicated probabilistic argument (see reference 
cited in footnote 4) yields the result 2?(;z)<545 for 
the particular illustrative data used, whereas our 
eq (4) (which was not aimed at estimating E(x) and 
does not use the probabilistic distribution of mail by 
destinations) can be combined with the inequality 

1^[X) f^Wmax poss 

to yield the sharper result £ , (x)<401 for the same 
data. 
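
A crude feeling for the average contents can be obtained by simulation. The sketch below is only a rough illustration under strong assumptions that are not taken from this paper: each letter is addressed independently and uniformly at random to one of the r destinations (the Philadelphia-type distribution is not reproduced here), the device starts empty, ties go to the lowest-numbered destination, and the initial transient is included in the average. The name `simulate_average` is hypothetical.

```python
import random


def simulate_average(r, k, cycles, seed=0):
    """Return (average, peak) contents before dropouts over `cycles` cycles."""
    rng = random.Random(seed)
    counts = [0] * r
    total = 0
    peak = 0
    for _ in range(cycles):
        for _ in range(k):
            counts[rng.randrange(r)] += 1    # uniform destinations (assumption)
        x = sum(counts)                      # contents before the dropout
        total += x
        peak = max(peak, x)
        counts[counts.index(max(counts))] = 0  # lowest-index tie-break (assumption)
    return total / cycles, peak


avg, peak = simulate_average(r=20, k=3, cycles=5000, seed=1)
print(avg, peak, (20 - 1) * (3 - 1) + 3)
```

Whatever the distribution, the observed peak cannot exceed (r-1)(k-1)+k, which is 41 for the values used in this call.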



6 The frequency of large-amplitude oscillations around this average must also
be considered; at the very least, some information is required on the standard
deviation of the number of letters in the device.



Washington, January 5, 1959. 